
Phase 1: Exploratory Data Analysis (EDA) on Spatio-temporal Epidemics (COVID-19)

Course: DAL-588 (Big Data and AI for Industry)

Group Members:

  • Akriti Gulati - 24565002
  • Anindya Biswas - 24565003
  • Priya Gupta - 24565013
  • Arpit Kumar Chaudhary - 24566006

This notebook covers Phase 1 of the project: Exploratory Data Analysis (EDA) on spatio-temporal epidemic data (COVID-19). The dataset contains the following columns:
* Province/State
* Country/Region
* Lat
* Long
* Date
* Confirmed
* Deaths
* Recovered
* Active
* WHO Region

Installing Python Libraries

In [ ]:
!pip install calmap
!pip install folium
!pip install geoviews
!pip install cartopy
!pip install basemap
Collecting calmap
  Downloading calmap-0.0.11-py2.py3-none-any.whl.metadata (2.2 kB)
Requirement already satisfied: matplotlib in /usr/local/lib/python3.11/dist-packages (from calmap) (3.10.0)
Requirement already satisfied: numpy in /usr/local/lib/python3.11/dist-packages (from calmap) (2.0.2)
Requirement already satisfied: pandas in /usr/local/lib/python3.11/dist-packages (from calmap) (2.2.2)
Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib->calmap) (1.3.2)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.11/dist-packages (from matplotlib->calmap) (0.12.1)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.11/dist-packages (from matplotlib->calmap) (4.58.0)
Requirement already satisfied: kiwisolver>=1.3.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib->calmap) (1.4.8)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.11/dist-packages (from matplotlib->calmap) (24.2)
Requirement already satisfied: pillow>=8 in /usr/local/lib/python3.11/dist-packages (from matplotlib->calmap) (11.2.1)
Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib->calmap) (3.2.3)
Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.11/dist-packages (from matplotlib->calmap) (2.9.0.post0)
Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.11/dist-packages (from pandas->calmap) (2025.2)
Requirement already satisfied: tzdata>=2022.7 in /usr/local/lib/python3.11/dist-packages (from pandas->calmap) (2025.2)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.11/dist-packages (from python-dateutil>=2.7->matplotlib->calmap) (1.17.0)
Downloading calmap-0.0.11-py2.py3-none-any.whl (7.3 kB)
Installing collected packages: calmap
Successfully installed calmap-0.0.11
Requirement already satisfied: folium in /usr/local/lib/python3.11/dist-packages (0.19.5)
Requirement already satisfied: branca>=0.6.0 in /usr/local/lib/python3.11/dist-packages (from folium) (0.8.1)
Requirement already satisfied: jinja2>=2.9 in /usr/local/lib/python3.11/dist-packages (from folium) (3.1.6)
Requirement already satisfied: numpy in /usr/local/lib/python3.11/dist-packages (from folium) (2.0.2)
Requirement already satisfied: requests in /usr/local/lib/python3.11/dist-packages (from folium) (2.32.3)
Requirement already satisfied: xyzservices in /usr/local/lib/python3.11/dist-packages (from folium) (2025.4.0)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.11/dist-packages (from jinja2>=2.9->folium) (3.0.2)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.11/dist-packages (from requests->folium) (3.4.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.11/dist-packages (from requests->folium) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.11/dist-packages (from requests->folium) (2.4.0)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.11/dist-packages (from requests->folium) (2025.4.26)
Collecting geoviews
  Downloading geoviews-1.14.0-py3-none-any.whl.metadata (8.5 kB)
Requirement already satisfied: bokeh>=3.6.0 in /usr/local/lib/python3.11/dist-packages (from geoviews) (3.7.3)
Collecting cartopy>=0.18.0 (from geoviews)
  Downloading Cartopy-0.24.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (7.9 kB)
Requirement already satisfied: holoviews>=1.16.0 in /usr/local/lib/python3.11/dist-packages (from geoviews) (1.20.2)
Requirement already satisfied: numpy in /usr/local/lib/python3.11/dist-packages (from geoviews) (2.0.2)
Requirement already satisfied: packaging in /usr/local/lib/python3.11/dist-packages (from geoviews) (24.2)
Requirement already satisfied: panel>=1.0.0 in /usr/local/lib/python3.11/dist-packages (from geoviews) (1.6.3)
Requirement already satisfied: param<3.0,>=1.9.3 in /usr/local/lib/python3.11/dist-packages (from geoviews) (2.2.0)
Requirement already satisfied: pyproj in /usr/local/lib/python3.11/dist-packages (from geoviews) (3.7.1)
Requirement already satisfied: shapely in /usr/local/lib/python3.11/dist-packages (from geoviews) (2.1.0)
Requirement already satisfied: xyzservices in /usr/local/lib/python3.11/dist-packages (from geoviews) (2025.4.0)
Requirement already satisfied: Jinja2>=2.9 in /usr/local/lib/python3.11/dist-packages (from bokeh>=3.6.0->geoviews) (3.1.6)
Requirement already satisfied: contourpy>=1.2 in /usr/local/lib/python3.11/dist-packages (from bokeh>=3.6.0->geoviews) (1.3.2)
Requirement already satisfied: narwhals>=1.13 in /usr/local/lib/python3.11/dist-packages (from bokeh>=3.6.0->geoviews) (1.39.0)
Requirement already satisfied: pandas>=1.2 in /usr/local/lib/python3.11/dist-packages (from bokeh>=3.6.0->geoviews) (2.2.2)
Requirement already satisfied: pillow>=7.1.0 in /usr/local/lib/python3.11/dist-packages (from bokeh>=3.6.0->geoviews) (11.2.1)
Requirement already satisfied: PyYAML>=3.10 in /usr/local/lib/python3.11/dist-packages (from bokeh>=3.6.0->geoviews) (6.0.2)
Requirement already satisfied: tornado>=6.2 in /usr/local/lib/python3.11/dist-packages (from bokeh>=3.6.0->geoviews) (6.4.2)
Requirement already satisfied: matplotlib>=3.6 in /usr/local/lib/python3.11/dist-packages (from cartopy>=0.18.0->geoviews) (3.10.0)
Requirement already satisfied: pyshp>=2.3 in /usr/local/lib/python3.11/dist-packages (from cartopy>=0.18.0->geoviews) (2.3.1)
Requirement already satisfied: colorcet in /usr/local/lib/python3.11/dist-packages (from holoviews>=1.16.0->geoviews) (3.1.0)
Requirement already satisfied: pyviz-comms>=2.1 in /usr/local/lib/python3.11/dist-packages (from holoviews>=1.16.0->geoviews) (3.0.4)
Requirement already satisfied: bleach in /usr/local/lib/python3.11/dist-packages (from panel>=1.0.0->geoviews) (6.2.0)
Requirement already satisfied: linkify-it-py in /usr/local/lib/python3.11/dist-packages (from panel>=1.0.0->geoviews) (2.0.3)
Requirement already satisfied: markdown in /usr/local/lib/python3.11/dist-packages (from panel>=1.0.0->geoviews) (3.8)
Requirement already satisfied: markdown-it-py in /usr/local/lib/python3.11/dist-packages (from panel>=1.0.0->geoviews) (3.0.0)
Requirement already satisfied: mdit-py-plugins in /usr/local/lib/python3.11/dist-packages (from panel>=1.0.0->geoviews) (0.4.2)
Requirement already satisfied: requests in /usr/local/lib/python3.11/dist-packages (from panel>=1.0.0->geoviews) (2.32.3)
Requirement already satisfied: tqdm in /usr/local/lib/python3.11/dist-packages (from panel>=1.0.0->geoviews) (4.67.1)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.11/dist-packages (from panel>=1.0.0->geoviews) (4.13.2)
Requirement already satisfied: certifi in /usr/local/lib/python3.11/dist-packages (from pyproj->geoviews) (2025.4.26)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.11/dist-packages (from Jinja2>=2.9->bokeh>=3.6.0->geoviews) (3.0.2)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.11/dist-packages (from matplotlib>=3.6->cartopy>=0.18.0->geoviews) (0.12.1)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.11/dist-packages (from matplotlib>=3.6->cartopy>=0.18.0->geoviews) (4.58.0)
Requirement already satisfied: kiwisolver>=1.3.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib>=3.6->cartopy>=0.18.0->geoviews) (1.4.8)
Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib>=3.6->cartopy>=0.18.0->geoviews) (3.2.3)
Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.11/dist-packages (from matplotlib>=3.6->cartopy>=0.18.0->geoviews) (2.9.0.post0)
Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.11/dist-packages (from pandas>=1.2->bokeh>=3.6.0->geoviews) (2025.2)
Requirement already satisfied: tzdata>=2022.7 in /usr/local/lib/python3.11/dist-packages (from pandas>=1.2->bokeh>=3.6.0->geoviews) (2025.2)
Requirement already satisfied: webencodings in /usr/local/lib/python3.11/dist-packages (from bleach->panel>=1.0.0->geoviews) (0.5.1)
Requirement already satisfied: uc-micro-py in /usr/local/lib/python3.11/dist-packages (from linkify-it-py->panel>=1.0.0->geoviews) (1.0.3)
Requirement already satisfied: mdurl~=0.1 in /usr/local/lib/python3.11/dist-packages (from markdown-it-py->panel>=1.0.0->geoviews) (0.1.2)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.11/dist-packages (from requests->panel>=1.0.0->geoviews) (3.4.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.11/dist-packages (from requests->panel>=1.0.0->geoviews) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.11/dist-packages (from requests->panel>=1.0.0->geoviews) (2.4.0)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.11/dist-packages (from python-dateutil>=2.7->matplotlib>=3.6->cartopy>=0.18.0->geoviews) (1.17.0)
Downloading geoviews-1.14.0-py3-none-any.whl (547 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 547.3/547.3 kB 7.3 MB/s eta 0:00:00
Downloading Cartopy-0.24.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.7 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.7/11.7 MB 67.8 MB/s eta 0:00:00
Installing collected packages: cartopy, geoviews
Successfully installed cartopy-0.24.1 geoviews-1.14.0
Requirement already satisfied: cartopy in /usr/local/lib/python3.11/dist-packages (0.24.1)
Requirement already satisfied: numpy>=1.23 in /usr/local/lib/python3.11/dist-packages (from cartopy) (2.0.2)
Requirement already satisfied: matplotlib>=3.6 in /usr/local/lib/python3.11/dist-packages (from cartopy) (3.10.0)
Requirement already satisfied: shapely>=1.8 in /usr/local/lib/python3.11/dist-packages (from cartopy) (2.1.0)
Requirement already satisfied: packaging>=21 in /usr/local/lib/python3.11/dist-packages (from cartopy) (24.2)
Requirement already satisfied: pyshp>=2.3 in /usr/local/lib/python3.11/dist-packages (from cartopy) (2.3.1)
Requirement already satisfied: pyproj>=3.3.1 in /usr/local/lib/python3.11/dist-packages (from cartopy) (3.7.1)
Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib>=3.6->cartopy) (1.3.2)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.11/dist-packages (from matplotlib>=3.6->cartopy) (0.12.1)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.11/dist-packages (from matplotlib>=3.6->cartopy) (4.58.0)
Requirement already satisfied: kiwisolver>=1.3.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib>=3.6->cartopy) (1.4.8)
Requirement already satisfied: pillow>=8 in /usr/local/lib/python3.11/dist-packages (from matplotlib>=3.6->cartopy) (11.2.1)
Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib>=3.6->cartopy) (3.2.3)
Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.11/dist-packages (from matplotlib>=3.6->cartopy) (2.9.0.post0)
Requirement already satisfied: certifi in /usr/local/lib/python3.11/dist-packages (from pyproj>=3.3.1->cartopy) (2025.4.26)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.11/dist-packages (from python-dateutil>=2.7->matplotlib>=3.6->cartopy) (1.17.0)
Collecting basemap
  Downloading basemap-1.4.1-cp311-cp311-manylinux1_x86_64.whl.metadata (9.1 kB)
Collecting basemap-data<1.4,>=1.3.2 (from basemap)
  Downloading basemap_data-1.3.2-py2.py3-none-any.whl.metadata (2.7 kB)
Requirement already satisfied: pyshp<2.4,>=1.2 in /usr/local/lib/python3.11/dist-packages (from basemap) (2.3.1)
Collecting matplotlib<3.9,>=1.5 (from basemap)
  Downloading matplotlib-3.8.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.8 kB)
Collecting pyproj<3.7.0,>=1.9.3 (from basemap)
  Downloading pyproj-3.6.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (31 kB)
Collecting packaging<24.0,>=16.0 (from basemap)
  Downloading packaging-23.2-py3-none-any.whl.metadata (3.2 kB)
Collecting numpy<1.27,>=1.21 (from basemap)
  Downloading numpy-1.26.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.0/61.0 kB 2.6 MB/s eta 0:00:00
Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib<3.9,>=1.5->basemap) (1.3.2)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.11/dist-packages (from matplotlib<3.9,>=1.5->basemap) (0.12.1)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.11/dist-packages (from matplotlib<3.9,>=1.5->basemap) (4.58.0)
Requirement already satisfied: kiwisolver>=1.3.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib<3.9,>=1.5->basemap) (1.4.8)
Requirement already satisfied: pillow>=8 in /usr/local/lib/python3.11/dist-packages (from matplotlib<3.9,>=1.5->basemap) (11.2.1)
Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib<3.9,>=1.5->basemap) (3.2.3)
Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.11/dist-packages (from matplotlib<3.9,>=1.5->basemap) (2.9.0.post0)
Requirement already satisfied: certifi in /usr/local/lib/python3.11/dist-packages (from pyproj<3.7.0,>=1.9.3->basemap) (2025.4.26)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.11/dist-packages (from python-dateutil>=2.7->matplotlib<3.9,>=1.5->basemap) (1.17.0)
Downloading basemap-1.4.1-cp311-cp311-manylinux1_x86_64.whl (942 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 942.4/942.4 kB 17.8 MB/s eta 0:00:00
Downloading basemap_data-1.3.2-py2.py3-none-any.whl (30.5 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 30.5/30.5 MB 60.1 MB/s eta 0:00:00
Downloading matplotlib-3.8.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.6/11.6 MB 98.8 MB/s eta 0:00:00
Downloading numpy-1.26.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.3 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.3/18.3 MB 86.3 MB/s eta 0:00:00
Downloading packaging-23.2-py3-none-any.whl (53 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 53.0/53.0 kB 3.6 MB/s eta 0:00:00
Downloading pyproj-3.6.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.6/8.6 MB 62.6 MB/s eta 0:00:00
Installing collected packages: pyproj, packaging, numpy, basemap-data, matplotlib, basemap
  Attempting uninstall: pyproj
    Found existing installation: pyproj 3.7.1
    Uninstalling pyproj-3.7.1:
      Successfully uninstalled pyproj-3.7.1
  Attempting uninstall: packaging
    Found existing installation: packaging 24.2
    Uninstalling packaging-24.2:
      Successfully uninstalled packaging-24.2
  Attempting uninstall: numpy
    Found existing installation: numpy 2.0.2
    Uninstalling numpy-2.0.2:
      Successfully uninstalled numpy-2.0.2
  Attempting uninstall: matplotlib
    Found existing installation: matplotlib 3.10.0
    Uninstalling matplotlib-3.10.0:
      Successfully uninstalled matplotlib-3.10.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-cloud-bigquery 3.32.0 requires packaging>=24.2.0, but you have packaging 23.2 which is incompatible.
thinc 8.3.6 requires numpy<3.0.0,>=2.0.0, but you have numpy 1.26.4 which is incompatible.
db-dtypes 1.4.3 requires packaging>=24.2.0, but you have packaging 23.2 which is incompatible.
Successfully installed basemap-1.4.1 basemap-data-1.3.2 matplotlib-3.8.4 numpy-1.26.4 packaging-23.2 pyproj-3.6.1
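
The log above ends with a dependency-resolver error: installing basemap replaced numpy 2.0.2 with 1.26.4, matplotlib 3.10.0 with 3.8.4, packaging 24.2 with 23.2, and pyproj 3.7.1 with 3.6.1. A minimal sketch for checking which versions are actually installed afterwards follows. The helper name `installed_versions` is hypothetical, and if any of these packages was already imported in the session, a runtime restart is needed before the replaced version is the one in use.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


# Packages touched by the install log above
for name, ver in installed_versions(
    ["numpy", "matplotlib", "packaging", "pyproj", "basemap", "cartopy"]
).items():
    print(f"{name}: {ver or 'not installed'}")
```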

Importing necessary libraries & Loading the COVID-19 Dataset

In [1]:
import pandas as pd
details = pd.read_csv('covid_19_data.csv', parse_dates=['Date'])
details.sample(6)
Out[1]:
       Province/State  Country/Region         Lat        Long        Date  Confirmed  Deaths  Recovered  Active  WHO Region
 2955             NaN      Costa Rica    9.748900  -83.753400  2020-02-02          0       0          0       0    Americas
18940             NaN       Lithuania   55.169400   23.881300  2020-04-03        696       9          7     680      Europe
37022             NaN  United Kingdom   55.378100   -3.436000  2020-06-11     268657   41377          0  227280      Europe
11241             NaN         Bahamas   25.025885  -78.035889  2020-03-05          0       0          0       0    Americas
 3501         Mayotte          France  -12.827500   45.166244  2020-02-04          0       0          0       0      Europe
23442             NaN         Ukraine   48.379400   31.165600  2020-04-20       5710     151        359    5200      Europe
In [ ]:
details.describe()
Out[ ]:
                Lat          Long                 Date     Confirmed         Deaths     Recovered        Active
count  49068.000000  49068.000000                49068  4.906800e+04   49068.000000  4.906800e+04  4.906800e+04
mean      21.433730     23.528236  2020-04-24 12:00:00  1.688490e+04     884.179160  7.915713e+03  8.085012e+03
min      -51.796300   -135.000000  2020-01-22 00:00:00  0.000000e+00       0.000000  0.000000e+00 -1.400000e+01
25%        7.873054    -15.310100  2020-03-08 18:00:00  4.000000e+00       0.000000  0.000000e+00  0.000000e+00
50%       23.634500     21.745300  2020-04-24 12:00:00  1.680000e+02       2.000000  2.900000e+01  2.600000e+01
75%       41.204380     80.771797  2020-06-10 06:00:00  1.518250e+03      30.000000  6.660000e+02  6.060000e+02
max       71.706900    178.065000  2020-07-27 00:00:00  4.290259e+06  148011.000000  1.846641e+06  2.816444e+06
std       24.950320     70.442740                  NaN  1.273002e+05    6313.584411  5.480092e+04  7.625890e+04
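
The summary above reports a minimum `Active` of -14, which cannot be a real case count. A small sketch that flags suspicious rows follows, assuming `Active` is meant to equal `Confirmed - Deaths - Recovered` (the formula this notebook applies later). The tiny frame is made-up illustration data and `flag_inconsistent` is a hypothetical helper.

```python
import pandas as pd


def flag_inconsistent(df):
    """Return rows where Active is negative or breaks the assumed identity."""
    expected = df["Confirmed"] - df["Deaths"] - df["Recovered"]
    mask = (df["Active"] < 0) | (df["Active"] != expected)
    return df[mask]


# Made-up rows for illustration only
demo = pd.DataFrame({
    "Confirmed": [100, 50, 10],
    "Deaths": [5, 1, 0],
    "Recovered": [20, 60, 2],
    "Active": [75, -11, 9],
})
print(flag_inconsistent(demo))
```

On the real data, `flag_inconsistent(details)` would list the rows worth inspecting before any ratio is computed from them.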
In [ ]:
details.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 49068 entries, 0 to 49067
Data columns (total 10 columns):
 #   Column          Non-Null Count  Dtype         
---  ------          --------------  -----         
 0   Province/State  14664 non-null  object        
 1   Country/Region  49068 non-null  object        
 2   Lat             49068 non-null  float64       
 3   Long            49068 non-null  float64       
 4   Date            49068 non-null  datetime64[ns]
 5   Confirmed       49068 non-null  int64         
 6   Deaths          49068 non-null  int64         
 7   Recovered       49068 non-null  int64         
 8   Active          49068 non-null  int64         
 9   WHO Region      49068 non-null  object        
dtypes: datetime64[ns](1), float64(2), int64(4), object(3)
memory usage: 3.7+ MB
In [ ]:
details.isna().sum()
Out[ ]:
Province/State    34404
Country/Region        0
Lat                   0
Long                  0
Date                  0
Confirmed             0
Deaths                0
Recovered             0
Active                0
WHO Region            0
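
`Province/State` is missing in 34,404 of the 49,068 rows, because most countries are reported as a single row per date. One possible way to get a usable per-location key is sketched below, under the assumption that falling back to the country name is acceptable for the analysis. The `Location` column name is hypothetical and the rows are made up.

```python
import pandas as pd

# Made-up rows mirroring the dataset's structure
demo = pd.DataFrame({
    "Province/State": [None, "Mayotte", None],
    "Country/Region": ["Lithuania", "France", "Ukraine"],
})

# Use the province where present, otherwise fall back to the country name
demo["Location"] = demo["Province/State"].fillna(demo["Country/Region"])
print(demo)
```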

In [ ]:
print(details.describe(include=['object']))
                      Province/State Country/Region WHO Region
count                          14664          49068      49068
unique                            78            187          6
top     Australian Capital Territory          China     Europe
freq                             188           6204      15040

Converting 'Date' column to datetime format & Extracting Year and Month

In [ ]:
# Converting 'Date' column to datetime format
# (already datetime because of parse_dates=['Date'] at load time, so this is a safeguard)
details['Date'] = pd.to_datetime(details['Date'])

# Extracting Year and Month
details['Year'] = details['Date'].dt.year
details['Month'] = details['Date'].dt.month
In [ ]:
details.head()
Out[ ]:
   Province/State  Country/Region        Lat       Long        Date  Confirmed  Deaths  Recovered  Active             WHO Region  Year  Month
0             NaN     Afghanistan   33.93911  67.709953  2020-01-22          0       0          0       0  Eastern Mediterranean  2020      1
1             NaN         Albania   41.15330  20.168300  2020-01-22          0       0          0       0                 Europe  2020      1
2             NaN         Algeria   28.03390   1.659600  2020-01-22          0       0          0       0                 Africa  2020      1
3             NaN         Andorra   42.50630   1.521800  2020-01-22          0       0          0       0                 Europe  2020      1
4             NaN          Angola  -11.20270  17.873900  2020-01-22          0       0          0       0                 Africa  2020      1
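
All dates in this dataset fall in 2020, so `Month` alone identifies the month here. If the data ever spanned several years, a combined year-month key would avoid mixing, say, January 2020 with January 2021. A sketch using pandas' `dt.to_period` follows; the `YearMonth` column name is hypothetical and the dates are made up.

```python
import pandas as pd

demo = pd.DataFrame({"Date": pd.to_datetime(["2020-01-22", "2020-02-02", "2021-01-05"])})

# A monthly period keeps year and month together in one sortable key
demo["YearMonth"] = demo["Date"].dt.to_period("M")
print(demo["YearMonth"].astype(str).tolist())
```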

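Adding new Columns

In [ ]:
# Creating an "Active Ratio" column (share of confirmed cases that are still active)
details["Active Ratio"] = details["Active"] / details["Confirmed"]

# Creating a "Fatality Rate" column (deaths per confirmed case)
# Rows with 0 confirmed cases give 0/0, which pandas stores as NaN
details["Fatality Rate"] = details["Deaths"] / details["Confirmed"]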
In [ ]:
details.head()
Out[ ]:
   Province/State  Country/Region        Lat       Long        Date  Confirmed  Deaths  Recovered  Active             WHO Region  Year  Month  Active Ratio  Fatality Rate
0             NaN     Afghanistan   33.93911  67.709953  2020-01-22          0       0          0       0  Eastern Mediterranean  2020      1           NaN            NaN
1             NaN         Albania   41.15330  20.168300  2020-01-22          0       0          0       0                 Europe  2020      1           NaN            NaN
2             NaN         Algeria   28.03390   1.659600  2020-01-22          0       0          0       0                 Africa  2020      1           NaN            NaN
3             NaN         Andorra   42.50630   1.521800  2020-01-22          0       0          0       0                 Europe  2020      1           NaN            NaN
4             NaN          Angola  -11.20270  17.873900  2020-01-22          0       0          0       0                 Africa  2020      1           NaN            NaN
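
The `NaN` values above come from dividing 0 by 0 on dates when a location had no confirmed cases yet. A nonzero numerator over a zero denominator would yield `inf` instead, which can distort later statistics. A defensive sketch that masks non-positive denominators explicitly follows; `safe_ratio` is a hypothetical helper and the values are made up.

```python
import pandas as pd


def safe_ratio(numerator, denominator):
    """Element-wise ratio that is NaN wherever the denominator is not positive."""
    return numerator / denominator.where(denominator > 0)


confirmed = pd.Series([0, 200, 0])
deaths = pd.Series([0, 5, 3])  # the last row is deliberately inconsistent
print(safe_ratio(deaths, confirmed).tolist())
```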
In [ ]:
details.shape
Out[ ]:
(49068, 14)
In [ ]:
print(details.describe())
                Lat          Long                 Date     Confirmed  \
count  49068.000000  49068.000000                49068  4.906800e+04   
mean      21.433730     23.528236  2020-04-24 12:00:00  1.688490e+04   
min      -51.796300   -135.000000  2020-01-22 00:00:00  0.000000e+00   
25%        7.873054    -15.310100  2020-03-08 18:00:00  4.000000e+00   
50%       23.634500     21.745300  2020-04-24 12:00:00  1.680000e+02   
75%       41.204380     80.771797  2020-06-10 06:00:00  1.518250e+03   
max       71.706900    178.065000  2020-07-27 00:00:00  4.290259e+06   
std       24.950320     70.442740                  NaN  1.273002e+05   

              Deaths     Recovered        Active     Year         Month  \
count   49068.000000  4.906800e+04  4.906800e+04  49068.0  49068.000000   
mean      884.179160  7.915713e+03  8.085012e+03   2020.0      4.281915   
min         0.000000  0.000000e+00 -1.400000e+01   2020.0      1.000000   
25%         0.000000  0.000000e+00  0.000000e+00   2020.0      3.000000   
50%         2.000000  2.900000e+01  2.600000e+01   2020.0      4.000000   
75%        30.000000  6.660000e+02  6.060000e+02   2020.0      6.000000   
max    148011.000000  1.846641e+06  2.816444e+06   2020.0      7.000000   
std      6313.584411  5.480092e+04  7.625890e+04      0.0      1.810241   

       Active Ratio  Fatality Rate  
count  39009.000000   39009.000000  
mean       0.496699       0.027994  
min       -0.035714       0.000000  
25%        0.095668       0.000000  
50%        0.501558       0.013699  
75%        0.877551       0.038560  
max        1.000000       1.000000  
std        0.375067       0.043864  

Count of records (rows) by country

In [ ]:
print(details['Country/Region'].value_counts().head(10))
Country/Region
China                  6204
Canada                 2256
France                 2068
United Kingdom         2068
Australia              1504
Netherlands             752
Denmark                 376
Algeria                 188
Andorra                 188
Antigua and Barbuda     188
Name: count, dtype: int64
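
These counts are rows, not cases: each location contributes one row per day. The date range reported by `describe()` (2020-01-22 to 2020-07-27) spans 188 days, so a country with 188 rows has a single location, while China's 6,204 rows correspond to 33 provinces. The arithmetic, as a quick check:

```python
from datetime import date

# Date range taken from the describe() output earlier in the notebook
n_days = (date(2020, 7, 27) - date(2020, 1, 22)).days + 1  # inclusive range

print(n_days)            # days covered by the dataset
print(6204 // n_days)    # locations reported for China
print(49068 // n_days)   # locations in the whole dataset
```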

Display of Top 10 Countries based on total Active case count

In [ ]:
# Sum the daily 'Active' values per country.
# Note: 'Active' is reported once per day, so this sum adds up every daily
# snapshot (cumulative case-days) rather than giving a single-day count.
active_cases_by_country = details.groupby('Country/Region', as_index=False)['Active'].sum()

# Sort countries from highest to lowest total
active_cases_by_country = active_cases_by_country.sort_values(by='Active', ascending=False)

# Display the top 10 countries without the index
print(active_cases_by_country.head(10).to_string(index=False))
Country/Region    Active
            US 156981121
        Brazil  31094060
United Kingdom  22624595
        Russia  19668578
         India  15987913
        France  10980287
         Spain   9277432
        Canada   8656985
          Peru   7748957
         Italy   7363518
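
Because every date contributes a row, the totals above add up each day's active count. If the goal is the number of active cases at a point in time, one option is to rank countries on the most recent date only. A sketch on made-up data follows (the numbers are illustrative, not from the dataset, and `latest_snapshot` is a hypothetical helper).

```python
import pandas as pd


def latest_snapshot(df, value_col, n=10):
    """Sum value_col per country on the most recent date and return the top n."""
    latest = df[df["Date"] == df["Date"].max()]
    return latest.groupby("Country/Region")[value_col].sum().nlargest(n)


# Made-up rows: two provinces of A and one row per date for B
demo = pd.DataFrame({
    "Country/Region": ["A", "A", "A", "A", "B", "B"],
    "Province/State": ["p1", "p2", "p1", "p2", None, None],
    "Date": pd.to_datetime(["2020-07-26", "2020-07-26", "2020-07-27",
                            "2020-07-27", "2020-07-26", "2020-07-27"]),
    "Active": [10, 20, 30, 40, 500, 5],
})
print(latest_snapshot(demo, "Active"))
```

In this toy example, summing over all dates would rank B first (505 against 100), while the latest snapshot ranks A first (70 against 5).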

Display of Top 10 Countries based on total Death case count

In [ ]:
# Sum the daily 'Deaths' values per country.
# Note: 'Deaths' is cumulative, so summing over dates counts earlier deaths
# repeatedly; the result ranks countries but is not a death toll.
deaths_by_country = details.groupby('Country/Region', as_index=False)['Deaths'].sum()

# Sort countries from highest to lowest total
deaths_by_country = deaths_by_country.sort_values(by='Deaths', ascending=False)

# Display the top 10 countries without the index
print(deaths_by_country.head(10).to_string(index=False))
Country/Region   Deaths
            US 11011411
United Kingdom  3997775
        Brazil  3938034
         Italy  3707717
        France  3048524
         Spain  3033030
        Mexico  1728277
         India  1111831
          Iran  1024136
       Belgium   963679

Extraction of Year from Date and Grouping by Country

In [ ]:
# Ensure 'Date' column is in datetime format (a no-op if it already is)
details['Date'] = pd.to_datetime(details['Date'])

# Extract 'Year' from 'Date'
details['Year'] = details['Date'].dt.year

# Group by 'Year' and 'Country/Region' and sum the daily values
yearly_country_summary = details.groupby(['Year', 'Country/Region'], as_index=False)[['Confirmed', 'Deaths', 'Recovered', 'Active']].sum()

# Compute Fatality Rate (Deaths per Confirmed cases) and Active Ratio (Active per Confirmed cases), both in percent
yearly_country_summary['Fatality Rate'] = (yearly_country_summary['Deaths'] / yearly_country_summary['Confirmed']) * 100
yearly_country_summary['Active Ratio'] = (yearly_country_summary['Active'] / yearly_country_summary['Confirmed']) * 100

# Display the first 10 records (alphabetical by country) without index
print(yearly_country_summary.head(10).to_string(index=False))
 Year      Country/Region  Confirmed  Deaths  Recovered  Active  Fatality Rate  Active Ratio
 2020         Afghanistan    1936390   49098     798240 1089052       2.535543     56.241356
 2020             Albania     196702    5708     118877   72117       2.901852     36.663074
 2020             Algeria    1179755   77972     755897  345886       6.609169     29.318460
 2020             Andorra      94404    5423      69074   19907       5.744460     21.087030
 2020              Angola      22662    1078       6573   15011       4.756862     66.238637
 2020 Antigua and Barbuda       4487     326       2600    1561       7.265433     34.789392
 2020           Argentina    4450658   97749    1680024 2672885       2.196282     60.055951
 2020             Armenia    1587173   27089     857482  702602       1.706745     44.267512
 2020           Australia     960247   11387     711928  236932       1.185841     24.674068
 2020             Austria    2034986   71390    1638380  325216       3.508132     15.981240
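
Since `Confirmed` and `Deaths` are cumulative, the ratio of their yearly sums is an average of daily ratios weighted by confirmed cases, not the rate at the end of the period. An alternative summary, sketched here on made-up numbers, takes each country's cumulative totals on the last reported date of each month. `month_end_totals` is a hypothetical helper, and grouping on the month number alone assumes the data stays within one year.

```python
import pandas as pd


def month_end_totals(df):
    """Cumulative totals per country on the last reported date of each month."""
    # First add up provinces so there is one row per country and date
    daily = df.groupby(["Country/Region", "Date"], as_index=False)[["Confirmed", "Deaths"]].sum()
    daily["Month"] = daily["Date"].dt.month
    # After sorting by date, last() picks the month-end snapshot
    daily = daily.sort_values("Date")
    out = daily.groupby(["Country/Region", "Month"], as_index=False).last()
    out["Fatality Rate"] = out["Deaths"] / out["Confirmed"] * 100
    return out


# Made-up cumulative counts for one country
demo = pd.DataFrame({
    "Country/Region": ["A", "A", "A"],
    "Date": pd.to_datetime(["2020-03-30", "2020-03-31", "2020-04-01"]),
    "Confirmed": [100, 200, 250],
    "Deaths": [2, 4, 10],
})
print(month_end_totals(demo))
```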

Cases Distribution

  • Based on status

In [ ]:
# Import the necessary library
import matplotlib.pyplot as plt

# Aggregate total numbers across all records
total_confirmed = details['Confirmed'].sum()
total_deaths = details['Deaths'].sum()
total_recovered = details['Recovered'].sum()
total_active = details['Active'].sum()

# Data for pie chart
labels = ['Active Cases', 'Recovered Cases', 'Deaths']
sizes = [total_active, total_recovered, total_deaths]
colors = ['#ffcc00', '#33cc33', '#ff3300']
explode = (0.1, 0, 0)  # Explode the "Active Cases" section for emphasis

# Create Pie Chart
plt.figure(figsize=(8, 6))
wedges, texts, autotexts = plt.pie(
    sizes, labels=labels, autopct='%1.1f%%', colors=colors,
    explode=explode, shadow=True, startangle=140
)

# Add legend
plt.legend(wedges, labels, title="Case Types", loc="upper right", bbox_to_anchor=(1.2, 1))

# Add Title
plt.title('COVID-19 Cases Distribution')

# Show Plot
plt.show()
[Figure: pie chart of the overall shares of active, recovered, and death counts]
  • Active cases in Top 10 Countries
In [ ]:
# Group data by country and sum the 'Active' cases
top_10_countries = details.groupby("Country/Region")["Active"].sum().nlargest(10)

# Data for the pie chart
labels = top_10_countries.index
sizes = top_10_countries.values
colors = plt.cm.tab20.colors
explode = [0.1 if i == 0 else 0 for i in range(10)]  # Highlight the top country

# Create a donut chart without text labels
plt.figure(figsize=(8, 6))
wedges, _ = plt.pie(
    sizes, colors=colors, explode=explode,
    shadow=True, startangle=140, wedgeprops={'width': 0.4}  # Ring width < 1 gives a donut
)

# Add legend with country names and percentages (shares among the top 10 only)
percentages = [f"{label}: {size / sum(sizes) * 100:.1f}%" for label, size in zip(labels, sizes)]
plt.legend(wedges, percentages, title="Top 10 Countries", loc="center left", bbox_to_anchor=(1, 0.5))

# Add Title
plt.title('Top 10 Countries with Most Active Cases')

# Show Plot
plt.show()
[Figure: donut chart of the top 10 countries by active cases]
  • Deaths in Top 10 Provinces/States
In [ ]:
# Group data by 'Province/State' and sum the 'Deaths' values.
# groupby drops rows whose 'Province/State' is NaN, so countries reported
# without a province breakdown are not part of this chart.
top_10_states = details.groupby("Province/State")["Deaths"].sum().nlargest(10)

# Data for the pie chart
labels = top_10_states.index
sizes = top_10_states.values
colors = plt.cm.tab20.colors
explode = [0.1 if i == 0 else 0 for i in range(10)]  # Highlight top state

# Create a donut chart without text labels
plt.figure(figsize=(8, 6))
wedges, _ = plt.pie(
    sizes, colors=colors, explode=explode,
    shadow=True, startangle=140, wedgeprops={'width': 0.4}  # Ring width < 1 gives a donut
)

# Add legend with state names and percentages (shares among the top 10 only)
percentages = [f"{label}: {size / sum(sizes) * 100:.1f}%" for label, size in zip(labels, sizes)]
plt.legend(wedges, percentages, title="Top 10 States", loc="center left", bbox_to_anchor=(1, 0.5))

# Add Title
plt.title('Top 10 States with Most Deaths')

# Show Plot
plt.show()
[Figure: donut chart of the top 10 provinces/states by deaths]
  • Recovered cases in Top 10 Provinces/States
In [ ]:
# Group data by 'Province/State' and sum the 'Recovered' values.
# groupby drops rows whose 'Province/State' is NaN, so countries reported
# without a province breakdown are not part of this chart.
top_10_states = details.groupby("Province/State")["Recovered"].sum().nlargest(10)

# Data for the pie chart
labels = top_10_states.index
sizes = top_10_states.values
colors = plt.cm.tab20.colors
explode = [0.1 if i == 0 else 0 for i in range(10)]  # Highlight top state

# Create a donut chart without text labels
plt.figure(figsize=(8, 6))
wedges, _ = plt.pie(
    sizes, colors=colors, explode=explode,
    shadow=True, startangle=140, wedgeprops={'width': 0.4}  # Ring width < 1 gives a donut
)

# Add legend with state names and percentages (shares among the top 10 only)
percentages = [f"{label}: {size / sum(sizes) * 100:.1f}%" for label, size in zip(labels, sizes)]
plt.legend(wedges, percentages, title="Top 10 States", loc="center left", bbox_to_anchor=(1, 0.5))

# Add Title
plt.title('Top 10 States with Most Recovered Cases')
# Show Plot
plt.show()
[Figure: donut chart of the top 10 provinces/states by recovered cases]
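
The three donut-chart cells above differ only in the grouping column and the value column. A sketch of a shared helper for the ranking and the legend labels follows, assuming the same percentage format as in those cells. `top_n_share_labels` is a hypothetical name and the demo values are made up.

```python
import pandas as pd


def top_n_share_labels(df, group_col, value_col, n=10):
    """Return the top-n group totals and 'name: xx.x%' labels among those n."""
    top = df.groupby(group_col)[value_col].sum().nlargest(n)
    labels = [f"{name}: {value / top.sum() * 100:.1f}%" for name, value in top.items()]
    return top, labels


# Made-up rows for illustration only
demo = pd.DataFrame({
    "Province/State": ["P1", "P1", "P2", "P3"],
    "Deaths": [30, 30, 30, 10],
})
top, labels = top_n_share_labels(demo, "Province/State", "Deaths", n=2)
print(labels)
```

As in the cells above, the percentages are shares within the selected top n, not shares of the global total.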

Trend of Active Cases in the Top 5 Worst-Affected Provinces/States

In [ ]:
# Importing necessary libraries
import seaborn as sns

# Identify top 5 states with the most active cases
top_5_states = details.groupby("Province/State")["Active"].sum().nlargest(5).index

# Filter dataset for these top 5 states
filtered_data = details[details['Province/State'].isin(top_5_states)]

# Line plot for the trend of active cases
plt.figure(figsize=(12, 6))
sns.lineplot(
    data=filtered_data,
    x='Date',
    y='Active',
    hue='Province/State',
    size='Province/State',
    sizes=(1, 2.5)  # Controls line thickness
)

# Title and labels
plt.title('Top 5 Most Affected States - Active Cases Trend', size=20)
plt.xlabel('Date')
plt.ylabel('Active Cases')
plt.grid(True)  # Add grid for better readability
plt.legend(title="States", loc='upper left')

# Show plot
plt.show()
[Figure: line plot of active cases over time for the top 5 provinces/states]

Fatality Ratio across Provinces/States

In [ ]:
# Calculate the per-row fatality ratio (same values as the 'Fatality Rate' column added earlier)
details['Fatality_Ratio'] = details['Deaths'] / details['Confirmed']

# Create the plot
plt.figure(figsize=(12, 6))
sns.pointplot(
    data=details,
    x='Province/State',
    y='Fatality_Ratio',
    color='green'
)

# Enhancing the visualization
plt.xticks(rotation=90)  # Rotate state names for better visibility
plt.title('Fatality Ratio of Affected States', size=20)
plt.xlabel('States/Provinces')
plt.ylabel('Fatality Ratio')
plt.grid(True)

# Show plot
plt.show()

Comparing the Worst-Affected Provinces

In [ ]:
# Select top 5 affected provinces for comparison
top_5_provinces = details.groupby("Province/State")["Confirmed"].sum().nlargest(5).index

# Extract data for these provinces
dates = details['Date'].unique()  # Extract unique dates
global_confirmed = []

# Collect confirmed case data for each province
for province in top_5_provinces:
    province_data = details[details['Province/State'] == province].groupby('Date')['Confirmed'].sum()
    global_confirmed.append(province_data.values)

# Plotting
plt.figure(figsize=(12, 6))
plt.xticks(rotation=90, fontsize=11)
plt.yticks(fontsize=10)
plt.xlabel("Dates", fontsize=20)
plt.ylabel("Total Confirmed Cases", fontsize=20)
plt.title("COVID-19 Comparison Among Top 5 Provinces", fontsize=20)

# Plot data for each province
for i in range(len(top_5_provinces)):
    plt.plot(dates, global_confirmed[i], label=top_5_provinces[i], linestyle='-')

plt.legend(title="Provinces/States", fontsize=12)
plt.grid(True)
plt.show()

Daily Confirmed Cases in the Top 15 Provinces (5-Day Moving Average)

In [ ]:
# Importing necessary libraries
import matplotlib.dates as mdates

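# Filter data after a specific date for recent cases
# (.copy() makes an independent frame, so the assignment below does not raise SettingWithCopyWarning)
latest = details[details['Date'] > '2020-03-24'].copy()

# Calculate 'Active' cases
latest['Active'] = latest['Confirmed'] - (latest['Deaths'] + latest['Recovered'])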

# Group and sort by confirmed cases
state_cases = latest.groupby('Province/State')[['Confirmed', 'Deaths', 'Recovered']].max().reset_index()
state_cases = state_cases.sort_values('Confirmed', ascending=False).fillna(0)
states = list(state_cases['Province/State'][:15])  # Top 15 states

# Dictionary to store cases data
states_confirmed = {}
states_dates = {}

# Extract state-wise data
for state in states:
    df = latest[latest['Province/State'] == state].reset_index()
    confirmed_diff = df['Confirmed'].diff().fillna(0).tolist()  # Daily change in confirmed cases
    states_confirmed[state] = confirmed_diff
    states_dates[state] = list(df['Date'])

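# Moving Average Calculation Function
def calc_movingaverage(values, N):
    """Return N-point moving averages of `values`, computed from a running cumulative sum.

    Two leading zeros pad the result so that, for N=5, entry j (j >= 2) is the
    average centered on day j and the output has len(values) - 2 entries.
    """
    cumsum, moving_aves = [0], [0, 0]
    for i, x in enumerate(values, 1):
        cumsum.append(cumsum[i - 1] + x)
        if i >= N:
            moving_ave = (cumsum[i] - cumsum[i - N]) / N
            moving_aves.append(moving_ave)
    return moving_aves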

# Plotting
fig = plt.figure(figsize=(25, 17))
plt.suptitle('5-Day Moving Average of Confirmed Cases in Top 15 Provinces', fontsize=20, y=1.0)

for i, state in enumerate(states, start=1):  # One subplot per state
    ax = fig.add_subplot(5, 3, i)
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%d-%b'))
    ax.bar(states_dates[state], states_confirmed[state], label='Day-wise Confirmed Cases', color='blue')

    # Moving Average Plot
    moving_aves = calc_movingaverage(states_confirmed[state], 5)
    ax.plot(states_dates[state][:-2], moving_aves, color='red', label='Moving Average', linewidth=3)

    ax.set_title(state, fontsize=20)

# Add one shared legend (calling fig.legend inside the loop stacks a copy per subplot)
handles, labels = ax.get_legend_handles_labels()
fig.legend(handles, labels, loc='upper left')

plt.tight_layout(pad=3.0)
plt.show()
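pandas' `rolling` method can express the same kind of moving average without a hand-written helper. This is a minimal sketch on a made-up series (`daily` is hypothetical), showing a centered and a trailing 5-day window:

```python
import pandas as pd

# Hypothetical daily new-case counts for one state
daily = pd.Series([0, 2, 4, 6, 8, 10, 12])

# Centered 5-day window: each value averages the day itself and two days on either side
centered = daily.rolling(window=5, center=True).mean()

# Trailing 5-day window: each value averages the day and the four days before it
trailing = daily.rolling(window=5).mean()

print(centered.tolist())
print(trailing.tolist())
```

Days without a full window come back as `NaN` instead of zero, so they are simply left out of a line plot. For `N=5`, the values of `calc_movingaverage` from position 2 onward should line up with the centered variant.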

Data Visualization

  • Confirmed Cases Over Time
In [ ]:
plt.figure(figsize=(12, 6))
sns.lineplot(data=details, x='Date', y='Confirmed', errorbar=None)
plt.title("Confirmed Cases Over Time")
plt.xlabel("Date")
plt.ylabel("Confirmed Cases")
plt.xticks(rotation=45)
plt.grid()
plt.show()
  • Heatmap for Correlation Analysis
In [ ]:
plt.figure(figsize=(10, 6))
sns.heatmap(details[['Confirmed', 'Deaths', 'Recovered', 'Active', 'Fatality Rate']].corr(), annot=True, cmap="coolwarm", fmt=".2f")
plt.title("Feature Correlation Heatmap")
plt.show()
  • Top 10 Affected Countries
In [ ]:
top_countries = details.groupby('Country/Region')['Confirmed'].sum().sort_values(ascending=False).head(10)
plt.figure(figsize=(10, 5))
sns.barplot(y=top_countries.index, x=top_countries.values, hue=top_countries.index, palette="Reds", legend=False)
plt.title("Top 10 Affected Countries")
plt.xlabel("Total Confirmed Cases")
plt.ylabel("Country")
plt.show()
  • Box Plot for Active Ratio & Fatality Rate
In [ ]:
# Data visualization
# Box Plot for Confirmed Cases
plt.figure(figsize=(8, 3))
sns.boxplot(x=details['Confirmed'], color='skyblue')
plt.title('Distribution of Confirmed Cases')
plt.xlabel('Confirmed Cases')
plt.grid(True)
plt.show()

# Box Plot for Active Ratio
plt.figure(figsize=(8, 3))
sns.boxplot(x=details['Active Ratio'], color='lightgreen')
plt.title('Distribution of Active Ratio')
plt.xlabel('Active Ratio')
plt.grid(True)
plt.show()

# Box Plot for Fatality Rate
plt.figure(figsize=(8, 3))
sns.boxplot(x=details['Fatality Rate'], color='salmon')
plt.title('Distribution of Fatality Rate')
plt.xlabel('Fatality Rate')
plt.grid(True)
plt.show()
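The box and the individual outlier points in these plots come from the quartiles of each column. As a rough illustration of the convention (Q1, Q3, and fences at 1.5 × IQR, which is the default whisker rule), here is a sketch on a made-up sample; `values` is hypothetical:

```python
import pandas as pd

# Hypothetical sample with one extreme value
values = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 100])

q1 = values.quantile(0.25)
q3 = values.quantile(0.75)
iqr = q3 - q1

# Points beyond 1.5 * IQR from the quartiles are drawn as individual outlier markers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = values[(values < lower_fence) | (values > upper_fence)]

print(q1, q3, outliers.tolist())
```

Cumulative case counts are heavily right-skewed, so a large share of rows can be expected to fall above the upper fence.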
  • Scatter Plot
In [ ]:
# Scatter Plot: Confirmed Cases vs Active Ratio
plt.figure(figsize=(8, 4))
plt.scatter(details['Confirmed'], details['Active Ratio'], alpha=0.5, c='blue', label='Active Ratio')
plt.title('Confirmed Cases vs Active Ratio')
plt.xlabel('Confirmed Cases')
plt.ylabel('Active Ratio')
plt.grid(True)
plt.legend()
plt.show()

# Scatter Plot: Confirmed Cases vs Fatality Rate
plt.figure(figsize=(8, 4))
plt.scatter(details['Confirmed'], details['Fatality Rate'], alpha=0.5, c='red', label='Fatality Rate')
plt.title('Confirmed Cases vs Fatality Rate')
plt.xlabel('Confirmed Cases')
plt.ylabel('Fatality Rate')
plt.grid(True)
plt.legend()
plt.show()

# Scatter Plot: Active Ratio vs Fatality Rate
plt.figure(figsize=(8, 4))
plt.scatter(details['Active Ratio'], details['Fatality Rate'], alpha=0.5, c='green', label='Active vs Fatality')
plt.title('Active Ratio vs Fatality Rate')
plt.xlabel('Active Ratio')
plt.ylabel('Fatality Rate')
plt.grid(True)
plt.legend()
plt.show()

Handling Missing and Infinite Values

In [ ]:
# Importing necessary libraries
from scipy.stats import pearsonr

# Correlation between Confirmed Cases and Deaths
corr, _ = pearsonr(details['Confirmed'], details['Deaths'])
print(f"Correlation between Confirmed Cases and Deaths: {corr:.2f}")
Correlation between Confirmed Cases and Deaths: 0.91
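To make the statistic concrete, here is a sketch that computes Pearson's r by hand as the covariance divided by the product of the standard deviations, then checks it against NumPy. The two arrays are made-up values, not rows from the dataset:

```python
import numpy as np

# Hypothetical paired observations (not taken from the dataset)
confirmed = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
deaths = np.array([1.0, 1.0, 2.0, 4.0, 4.0])

# Pearson r = sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2)), with dx, dy the deviations from the mean
dx = confirmed - confirmed.mean()
dy = deaths - deaths.mean()
r_manual = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# NumPy's correlation matrix should agree with the manual value
r_numpy = np.corrcoef(confirmed, deaths)[0, 1]
print(round(r_manual, 4), round(r_numpy, 4))
```

Both series here are cumulative and grow together over time, which by itself tends to push r toward 1.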
In [ ]:
details.isna().sum()
Out[ ]:
0
Province/State 34404
Country/Region 0
Lat 0
Long 0
Date 0
Confirmed 0
Deaths 0
Recovered 0
Active 0
WHO Region 0
Year 0
Month 0
Active Ratio 10059
Fatality Rate 10059
Fatality_Ratio 10059

In [ ]:
cases = ['Confirmed', 'Deaths', 'Recovered', 'Active']
details['total_cases'] = details['Confirmed'] + details['Deaths'] + details['Recovered']
# Active cases = Confirmed - Deaths - Recovered
details['Active'] = details['Confirmed'] - details['Deaths'] - details['Recovered']
# Replace 'Mainland China' with 'China'
details['Country/Region'] = details['Country/Region'].replace('Mainland China', 'China')

# filling missing values
details[['Province/State']] = details[['Province/State']].fillna('')
details[cases] = details[cases].fillna(0)

# fixing datatypes
details['Recovered'] = details['Recovered'].astype(int)

details.sample(6)
Out[ ]:
Province/State Country/Region Lat Long Date Confirmed Deaths Recovered Active WHO Region Year Month Active Ratio Fatality Rate Fatality_Ratio total_cases
30766 Dominica 15.415000 -61.371000 2020-05-18 16 0 16 0 Americas 2020 5 0.000000 0.000000 0.000000 32
45546 Iraq 33.223191 43.679291 2020-07-14 81757 3345 50782 27630 Eastern Mediterranean 2020 7 0.337953 0.040914 0.040914 135884
641 Ghana 7.946500 -1.023200 2020-01-24 0 0 0 0 Africa 2020 1 NaN NaN NaN 0
3603 Tunisia 33.886917 9.537499 2020-02-04 0 0 0 0 Eastern Mediterranean 2020 2 NaN NaN NaN 0
9200 Jiangxi China 27.614000 115.722100 2020-02-26 934 1 719 214 Western Pacific 2020 2 0.229122 0.001071 0.001071 1654
4126 Turkey 38.963700 35.243300 2020-02-06 0 0 0 0 Europe 2020 2 NaN NaN NaN 0
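The sample rows show that the ratio columns are still `NaN` wherever `Confirmed` is 0, because 0/0 is undefined. One possible way to handle this is sketched below. It assumes that a ratio of 0 is acceptable for rows with no confirmed cases, which is a modelling choice and not something the notebook states. The `toy` frame is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical rows: no cases at all, an inconsistent row (deaths but no cases), a normal row
toy = pd.DataFrame({
    "Confirmed": [0, 0, 200],
    "Deaths": [0, 3, 10],
})

ratio = toy["Deaths"] / toy["Confirmed"]  # NaN for 0/0, inf for 3/0, 0.05 for 10/200

# Convert infinities to NaN first, then fill every missing ratio with 0
toy["Fatality Rate"] = ratio.replace([np.inf, -np.inf], np.nan).fillna(0)
print(toy)
```

Leaving the values as `NaN` is also defensible, since pandas skips them in most aggregations. Either way the choice should be made explicitly.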

Day-wise Aggregate Cases

In [ ]:
temp = details.groupby('Date')[['Confirmed', 'Deaths', 'Recovered', 'Active']].sum().reset_index()
temp = temp[temp['Date'] == max(temp['Date'])].reset_index(drop=True)
temp['Global Mortality'] = temp['Deaths'] / temp['Confirmed']
temp['Deaths per 100 Confirmed Cases'] = temp['Global Mortality'] * 100
temp.style.background_gradient(cmap='Pastel1')
Out[ ]:
Date Confirmed Deaths Recovered Active Global Mortality Deaths per 100 Confirmed Cases
0 2020-07-27 00:00:00 16480485 654036 9468087 6358362 0.039685 3.968548
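As a quick check of the arithmetic behind this row, the same quantities can be recomputed from the three totals shown for 2020-07-27:

```python
# Totals for 2020-07-27, copied from the output row above
confirmed = 16_480_485
deaths = 654_036
recovered = 9_468_087

global_mortality = deaths / confirmed      # fraction of confirmed cases that ended in death
deaths_per_100 = global_mortality * 100    # the same quantity expressed per 100 confirmed cases
active = confirmed - deaths - recovered    # Active = Confirmed - Deaths - Recovered

print(round(global_mortality, 6), round(deaths_per_100, 6), active)
```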
In [ ]:
full_latest = details[details['Date'] == max(details['Date'])].reset_index()
full_latest_grouped = full_latest.groupby('Country/Region')[['Confirmed', 'Deaths', 'Recovered', 'Active']].sum().reset_index()

temp_f = full_latest_grouped.sort_values(by='Confirmed', ascending=False)
temp_f = temp_f[['Country/Region', 'Confirmed', 'Active', 'Deaths', 'Recovered']]
temp_f = temp_f.reset_index(drop=True)

temp_f.style.background_gradient(cmap="Blues", subset=['Confirmed', 'Active'])\
       .background_gradient(cmap="Greens", subset=['Recovered'])\
       .background_gradient(cmap="Reds", subset=['Deaths'])
Out[ ]:
Country/Region Confirmed Active Deaths Recovered
0 US 4290259 2816444 148011 1325804
1 Brazil 2442375 508116 87618 1846641
2 India 1480073 495499 33408 951166
3 Russia 816680 201097 13334 602249
4 South Africa 452529 170537 7067 274925
5 Mexico 395489 47657 44022 303810
6 Peru 389717 98752 18418 272547
7 Chile 347923 18782 9187 319954
8 United Kingdom 301708 254427 45844 1437
9 Iran 293606 22550 15912 255144
10 Pakistan 274289 27421 5842 241026
11 Spain 272421 93613 28432 150376
12 Saudi Arabia 268934 43238 2760 222936
13 Colombia 257101 117163 8777 131161
14 Italy 246286 12581 35112 198593
15 Turkey 227019 10920 5630 210469
16 Bangladesh 226225 97577 2965 125683
17 France 220352 108928 30212 81212
18 Germany 207112 7673 9125 190314
19 Argentina 167416 91782 3059 72575
20 Canada 116458 107514 8944 0
21 Iraq 112585 30983 4458 77144
22 Qatar 109597 3104 165 106328
23 Indonesia 100303 37292 4838 58173
24 Egypt 92482 52992 4652 34838
25 China 86783 3258 4656 78869
26 Kazakhstan 84648 29659 585 54404
27 Philippines 82040 53649 1945 26446
28 Ecuador 81161 40733 5532 34896
29 Sweden 79395 73695 5700 0
30 Oman 77058 19637 393 57028
31 Bolivia 71181 47056 2647 21478
32 Belarus 67251 6221 538 60492
33 Ukraine 67096 28258 1636 37202
34 Belgium 66428 39154 9822 17452
35 Kuwait 64379 8884 438 55057
36 Dominican Republic 64156 32869 1083 30204
37 Israel 63985 36378 474 27133
38 Panama 61442 25034 1322 35086
39 United Arab Emirates 59177 6322 345 52510
40 Netherlands 53413 47064 6160 189
41 Singapore 50838 5119 27 45692
42 Portugal 50299 13205 1719 35375
43 Romania 45902 17902 2206 25794
44 Guatemala 45309 11093 1761 32455
45 Poland 43402 8870 1676 32856
46 Nigeria 41180 22117 860 18203
47 Honduras 39741 33536 1166 5039
48 Bahrain 39482 3231 141 36110
49 Armenia 37390 10014 711 26665
50 Afghanistan 36263 9796 1269 25198
51 Switzerland 34477 1599 1978 30900
52 Ghana 33624 3655 168 29801
53 Kyrgyzstan 33296 10790 1301 21205
54 Japan 31142 8174 998 21970
55 Azerbaijan 30446 6781 423 23242
56 Algeria 27973 7973 1163 18837
57 Ireland 25892 764 1764 23364
58 Serbia 24141 23598 543 0
59 Moldova 23154 6252 748 16154
60 Uzbekistan 21209 9414 121 11674
61 Morocco 20887 4018 316 16553
62 Austria 20558 1599 713 18246
63 Nepal 18752 4950 48 13754
64 Kenya 17975 9857 285 7833
65 Cameroon 17110 2180 391 14539
66 Venezuela 15988 5883 146 9959
67 Costa Rica 15841 11902 115 3824
68 Cote d'Ivoire 15655 5198 96 10361
69 Czechia 15516 3715 373 11428
70 Australia 15303 5825 167 9311
71 El Salvador 15035 6849 408 7778
72 Ethiopia 14547 7933 228 6386
73 South Korea 14203 896 300 13007
74 Denmark 13761 543 613 12605
75 Sudan 11424 4765 720 5939
76 West Bank and Gaza 10621 6791 78 3752
77 Bulgaria 10621 4689 347 5585
78 Bosnia and Herzegovina 10498 5274 294 4930
79 North Macedonia 10213 4183 466 5564
80 Senegal 9764 3093 194 6477
81 Madagascar 9690 3339 91 6260
82 Norway 9132 125 255 8752
83 Malaysia 8904 179 124 8601
84 Congo (Kinshasa) 8844 2936 208 5700
85 Kosovo 7413 3201 185 4027
86 Finland 7398 149 329 6920
87 Haiti 7340 2817 158 4365
88 Tajikistan 7235 1147 60 6028
89 Gabon 7189 2458 49 4682
90 Guinea 7055 753 45 6257
91 Luxembourg 6321 1384 112 4825
92 Mauritania 6208 1399 156 4653
93 Djibouti 5059 24 58 4977
94 Croatia 4881 806 139 3936
95 Albania 4880 1991 144 2745
96 Central African Republic 4599 2994 59 1546
97 Zambia 4552 1597 140 2815
98 Paraguay 4548 1600 43 2905
99 Hungary 4448 523 596 3329
100 Greece 4227 2651 202 1374
101 Lebanon 3882 2122 51 1709
102 Malawi 3664 1920 99 1645
103 Nicaragua 3439 839 108 2492
104 Maldives 3369 807 15 2547
105 Thailand 3297 128 58 3111
106 Congo (Brazzaville) 3200 2317 54 829
107 Somalia 3196 1560 93 1543
108 Equatorial Guinea 3071 2178 51 842
109 Montenegro 2893 2039 45 809
110 Libya 2827 2186 64 577
111 Sri Lanka 2805 673 11 2121
112 Zimbabwe 2704 2126 36 542
113 Cuba 2532 94 87 2351
114 Mali 2513 476 124 1913
115 Cabo Verde 2328 756 22 1550
116 Eswatini 2316 1257 34 1025
117 South Sudan 2305 1084 46 1175
118 Slovakia 2181 537 28 1616
119 Slovenia 2087 238 116 1733
120 Estonia 2034 42 69 1923
121 Lithuania 2019 319 80 1620
122 Guinea-Bissau 1954 1125 26 803
123 Rwanda 1879 899 5 975
124 Iceland 1854 21 10 1823
125 Namibia 1843 1734 8 101
126 Sierra Leone 1783 400 66 1317
127 Benin 1770 699 35 1036
128 Mozambique 1701 1690 11 0
129 Yemen 1691 375 483 833
130 New Zealand 1557 21 22 1514
131 Suriname 1483 534 24 925
132 Tunisia 1455 248 50 1157
133 Latvia 1219 143 31 1045
134 Uruguay 1202 216 35 951
135 Jordan 1176 124 11 1041
136 Liberia 1167 449 72 646
137 Georgia 1137 199 16 922
138 Niger 1132 36 69 1027
139 Uganda 1128 140 2 986
140 Burkina Faso 1100 121 53 926
141 Cyprus 1060 189 19 852
142 Angola 950 667 41 242
143 Chad 922 37 75 810
144 Andorra 907 52 52 803
145 Togo 874 249 18 607
146 Sao Tome and Principe 865 117 14 734
147 Jamaica 853 129 10 714
148 Botswana 739 674 2 63
149 Malta 701 27 9 665
150 San Marino 699 0 42 657
151 Syria 674 634 40 0
152 Tanzania 509 305 21 183
153 Lesotho 505 365 12 128
154 Taiwan* 462 15 7 440
155 Vietnam 431 66 0 365
156 Guyana 389 188 20 181
157 Bahamas 382 280 11 91
158 Burundi 378 76 1 301
159 Comoros 354 19 7 328
160 Burma 350 52 6 292
161 Mauritius 344 2 10 332
162 Gambia 326 252 8 66
163 Mongolia 289 67 0 222
164 Eritrea 265 74 0 191
165 Cambodia 226 79 0 147
166 Trinidad and Tobago 148 12 8 128
167 Brunei 141 0 3 138
168 Monaco 116 8 4 104
169 Seychelles 114 75 0 39
170 Barbados 110 9 7 94
171 Bhutan 99 13 0 86
172 Liechtenstein 86 4 1 81
173 Antigua and Barbuda 86 18 3 65
174 Papua New Guinea 62 51 0 11
175 Saint Vincent and the Grenadines 52 13 0 39
176 Belize 48 20 2 26
177 Fiji 27 9 0 18
178 Timor-Leste 24 24 0 0
179 Saint Lucia 24 2 0 22
180 Grenada 23 0 0 23
181 Laos 20 1 0 19
182 Dominica 18 0 0 18
183 Saint Kitts and Nevis 17 2 0 15
184 Greenland 14 1 0 13
185 Holy See 12 0 0 12
186 Western Sahara 10 1 1 8
In [ ]:
temp_flg = temp_f[temp_f['Deaths']>0][['Country/Region', 'Deaths']]
temp_flg['Deaths / 100 Cases'] = round((temp_f['Deaths']/temp_f['Confirmed'])*100, 2)
temp_flg.sort_values('Deaths', ascending=False).reset_index(drop=True).style.background_gradient(cmap='Reds')
Out[ ]:
Country/Region Deaths Deaths / 100 Cases
0 US 148011 3.450000
1 Brazil 87618 3.590000
2 United Kingdom 45844 15.190000
3 Mexico 44022 11.130000
4 Italy 35112 14.260000
5 India 33408 2.260000
6 France 30212 13.710000
7 Spain 28432 10.440000
8 Peru 18418 4.730000
9 Iran 15912 5.420000
10 Russia 13334 1.630000
11 Belgium 9822 14.790000
12 Chile 9187 2.640000
13 Germany 9125 4.410000
14 Canada 8944 7.680000
15 Colombia 8777 3.410000
16 South Africa 7067 1.560000
17 Netherlands 6160 11.530000
18 Pakistan 5842 2.130000
19 Sweden 5700 7.180000
20 Turkey 5630 2.480000
21 Ecuador 5532 6.820000
22 Indonesia 4838 4.820000
23 China 4656 5.370000
24 Egypt 4652 5.030000
25 Iraq 4458 3.960000
26 Argentina 3059 1.830000
27 Bangladesh 2965 1.310000
28 Saudi Arabia 2760 1.030000
29 Bolivia 2647 3.720000
30 Romania 2206 4.810000
31 Switzerland 1978 5.740000
32 Philippines 1945 2.370000
33 Ireland 1764 6.810000
34 Guatemala 1761 3.890000
35 Portugal 1719 3.420000
36 Poland 1676 3.860000
37 Ukraine 1636 2.440000
38 Panama 1322 2.150000
39 Kyrgyzstan 1301 3.910000
40 Afghanistan 1269 3.500000
41 Honduras 1166 2.930000
42 Algeria 1163 4.160000
43 Dominican Republic 1083 1.690000
44 Japan 998 3.200000
45 Nigeria 860 2.090000
46 Moldova 748 3.230000
47 Sudan 720 6.300000
48 Austria 713 3.470000
49 Armenia 711 1.900000
50 Denmark 613 4.450000
51 Hungary 596 13.400000
52 Kazakhstan 585 0.690000
53 Serbia 543 2.250000
54 Belarus 538 0.800000
55 Yemen 483 28.560000
56 Israel 474 0.740000
57 North Macedonia 466 4.560000
58 Kuwait 438 0.680000
59 Azerbaijan 423 1.390000
60 El Salvador 408 2.710000
61 Oman 393 0.510000
62 Cameroon 391 2.290000
63 Czechia 373 2.400000
64 Bulgaria 347 3.270000
65 United Arab Emirates 345 0.580000
66 Finland 329 4.450000
67 Morocco 316 1.510000
68 South Korea 300 2.110000
69 Bosnia and Herzegovina 294 2.800000
70 Kenya 285 1.590000
71 Norway 255 2.790000
72 Ethiopia 228 1.570000
73 Congo (Kinshasa) 208 2.350000
74 Greece 202 4.780000
75 Senegal 194 1.990000
76 Kosovo 185 2.500000
77 Ghana 168 0.500000
78 Australia 167 1.090000
79 Qatar 165 0.150000
80 Haiti 158 2.150000
81 Mauritania 156 2.510000
82 Venezuela 146 0.910000
83 Albania 144 2.950000
84 Bahrain 141 0.360000
85 Zambia 140 3.080000
86 Croatia 139 2.850000
87 Mali 124 4.930000
88 Malaysia 124 1.390000
89 Uzbekistan 121 0.570000
90 Slovenia 116 5.560000
91 Costa Rica 115 0.730000
92 Luxembourg 112 1.770000
93 Nicaragua 108 3.140000
94 Malawi 99 2.700000
95 Cote d'Ivoire 96 0.610000
96 Somalia 93 2.910000
97 Madagascar 91 0.940000
98 Cuba 87 3.440000
99 Lithuania 80 3.960000
100 West Bank and Gaza 78 0.730000
101 Chad 75 8.130000
102 Liberia 72 6.170000
103 Estonia 69 3.390000
104 Niger 69 6.100000
105 Sierra Leone 66 3.700000
106 Libya 64 2.260000
107 Tajikistan 60 0.830000
108 Central African Republic 59 1.280000
109 Thailand 58 1.760000
110 Djibouti 58 1.150000
111 Congo (Brazzaville) 54 1.690000
112 Burkina Faso 53 4.820000
113 Andorra 52 5.730000
114 Equatorial Guinea 51 1.660000
115 Lebanon 51 1.310000
116 Tunisia 50 3.440000
117 Gabon 49 0.680000
118 Nepal 48 0.260000
119 South Sudan 46 2.000000
120 Montenegro 45 1.560000
121 Guinea 45 0.640000
122 Paraguay 43 0.950000
123 San Marino 42 6.010000
124 Angola 41 4.320000
125 Syria 40 5.930000
126 Zimbabwe 36 1.330000
127 Uruguay 35 2.910000
128 Benin 35 1.980000
129 Eswatini 34 1.470000
130 Latvia 31 2.540000
131 Slovakia 28 1.280000
132 Singapore 27 0.050000
133 Guinea-Bissau 26 1.330000
134 Suriname 24 1.620000
135 New Zealand 22 1.410000
136 Cabo Verde 22 0.950000
137 Tanzania 21 4.130000
138 Guyana 20 5.140000
139 Cyprus 19 1.790000
140 Togo 18 2.060000
141 Georgia 16 1.410000
142 Maldives 15 0.450000
143 Sao Tome and Principe 14 1.620000
144 Lesotho 12 2.380000
145 Sri Lanka 11 0.390000
146 Mozambique 11 0.650000
147 Bahamas 11 2.880000
148 Jordan 11 0.940000
149 Iceland 10 0.540000
150 Jamaica 10 1.170000
151 Mauritius 10 2.910000
152 Malta 9 1.280000
153 Namibia 8 0.430000
154 Trinidad and Tobago 8 5.410000
155 Gambia 8 2.450000
156 Barbados 7 6.360000
157 Comoros 7 1.980000
158 Taiwan* 7 1.520000
159 Burma 6 1.710000
160 Rwanda 5 0.270000
161 Monaco 4 3.450000
162 Antigua and Barbuda 3 3.490000
163 Brunei 3 2.130000
164 Botswana 2 0.270000
165 Uganda 2 0.180000
166 Belize 2 4.170000
167 Burundi 1 0.260000
168 Liechtenstein 1 1.160000
169 Western Sahara 1 10.000000
In [ ]:
import plotly.express as px
import numpy as np  # Since np is used, ensure NumPy is imported

# Choropleth map for confirmed cases
fig = px.choropleth(full_latest_grouped,
                    locations="Country/Region",
                    locationmode='country names',
                    color=np.log(full_latest_grouped["Confirmed"]),
                    hover_name="Country/Region",
                    hover_data=['Confirmed'],
                    color_continuous_scale="peach",
                    title='Countries with Confirmed Cases')

fig.update(layout_coloraxis_showscale=False)
fig.show()
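The log transform keeps a few very large countries from washing out the color scale, but `np.log` returns `-inf` for a count of zero. One alternative is `np.log1p`, which computes log(1 + x). This is a suggestion rather than what the notebook does, and the `counts` array is made up:

```python
import numpy as np

# Hypothetical counts, including a zero
counts = np.array([0, 9, 99, 999_999])

# log1p(x) = log(1 + x): zero maps to 0 instead of -inf, and the ordering is preserved
color_values = np.log1p(counts)
print(color_values)
```

With this transform, rows with zero counts would not need to be filtered out before plotting.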
In [ ]:
# Deaths
temp = full_latest_grouped[full_latest_grouped['Deaths']>0]
fig = px.choropleth(temp,
                    locations="Country/Region", locationmode='country names',
                    color=np.log(temp["Deaths"]), hover_name="Country/Region",
                    color_continuous_scale="Peach", hover_data=['Deaths'],
                    title='Countries with Deaths Reported')
fig.update(layout_coloraxis_showscale=False)
fig.show()
In [ ]:
formated_gdf = details.groupby(['Date', 'Country/Region'])[['Confirmed', 'Deaths']].max()
formated_gdf = formated_gdf.reset_index()
formated_gdf['Date'] = pd.to_datetime(formated_gdf['Date'])
formated_gdf['Date'] = formated_gdf['Date'].dt.strftime('%m/%d/%Y')
formated_gdf['size'] = formated_gdf['Confirmed'].pow(0.3)

fig = px.scatter_geo(formated_gdf, locations="Country/Region", locationmode='country names',
                     color="Confirmed", size='size', hover_name="Country/Region",
                     range_color= [0, max(formated_gdf['Confirmed'])+2], animation_frame="Date",
                     title='Spread over time')
fig.update(layout_coloraxis_showscale=False)
fig.show()
In [ ]:
formated_gdf = details.groupby(['Date', 'Country/Region'])[['Recovered', 'Deaths']].max()
formated_gdf = formated_gdf.reset_index()
formated_gdf['Date'] = pd.to_datetime(formated_gdf['Date'])
formated_gdf['Date'] = formated_gdf['Date'].dt.strftime('%m/%d/%Y')
formated_gdf['size'] = formated_gdf['Recovered'].pow(0.3)

fig = px.scatter_geo(formated_gdf, locations="Country/Region", locationmode='country names',
                     color="Recovered", size='size', hover_name="Country/Region",
                     range_color= [0, max(formated_gdf['Recovered'])+2], animation_frame="Date",
                     title='Recovery over time')
fig.update(layout_coloraxis_showscale=False)
fig.show()
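In [ ]:
# Sum the case counts per country, but average Lat/Long so multi-province countries keep valid coordinates
full_latest_grouped = full_latest.groupby('Country/Region').agg(
    {'Confirmed': 'sum', 'Deaths': 'sum', 'Recovered': 'sum', 'Active': 'sum', 'Lat': 'mean', 'Long': 'mean'}
).reset_index()
In [ ]: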
full_latest_grouped.head()
Out[ ]:
Country/Region Confirmed Deaths Recovered Active Lat Long
0 Afghanistan 36263 1269 25198 9796 33.93911 67.709953
1 Albania 4880 144 2745 1991 41.15330 20.168300
2 Algeria 27973 1163 18837 7973 28.03390 1.659600
3 Andorra 907 52 803 52 42.50630 1.521800
4 Angola 950 41 242 667 -11.20270 17.873900
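One thing to watch when aggregating coordinates by country: the first rows shown here are single-row countries, so any aggregation returns their coordinates unchanged. For countries reported as several provinces, case counts should be summed but coordinates should not. A sketch with a hypothetical two-province country `X`:

```python
import pandas as pd

# Hypothetical country X reported as two provinces on the latest date
toy = pd.DataFrame({
    "Country/Region": ["X", "X", "Y"],
    "Confirmed": [100, 50, 7],
    "Lat": [-35.0, -33.0, 10.0],
    "Long": [149.0, 151.0, 20.0],
})

# Sum the counts, but average the coordinates
per_country = toy.groupby("Country/Region").agg(
    {"Confirmed": "sum", "Lat": "mean", "Long": "mean"}
).reset_index()
x_row = per_country.set_index("Country/Region").loc["X"]

# For comparison: summing the coordinates gives X a longitude of 300, outside the valid range
summed = toy.groupby("Country/Region")[["Lat", "Long"]].sum()
print(per_country)
```

The unweighted mean is only a rough centroid. Weighting by case count or using a fixed per-country coordinate would be other reasonable options.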
In [ ]:
# Importing necessary libraries
import folium

# Create a map
m = folium.Map(location=[54, 15], tiles='openstreetmap', zoom_start=2)

# Add points to the map
for idx, row in full_latest_grouped.iterrows():
    folium.Marker([row['Lat'], row['Long']], popup=str(row['Confirmed'])).add_to(m)

# Display the map
m
In [ ]:
# Create a map
m = folium.Map(location=[54, 15], tiles='openstreetmap', zoom_start=2)

# Add points to the map
for idx, row in full_latest_grouped.iterrows():
    folium.Marker([row['Lat'], row['Long']], popup=str(row['Recovered'])).add_to(m)

# Display the map
m
In [ ]:
# Create a map
m = folium.Map(location=[54, 15], tiles='openstreetmap', zoom_start=2)

# Add points to the map
for idx, row in full_latest_grouped.iterrows():
    folium.Marker([row['Lat'], row['Long']], popup=str(row['Deaths'])).add_to(m)

# Display the map
m
In [ ]:
# Importing necessary libraries
from folium.plugins import HeatMap  # Import HeatMap

# Create map
m = folium.Map(location=[54, 15], zoom_start=2)

# Add HeatMap layer
HeatMap(data=full_latest_grouped[['Lat', 'Long']], radius=15).add_to(m)

# Show the map
m
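This heat map receives only coordinates, so every country contributes equally regardless of its case count. folium's `HeatMap` also accepts rows of the form `[lat, lng, weight]`. Here is a sketch of building such rows, with the weight scaled from `Confirmed`. The scaling choice and the `toy` frame are assumptions for illustration.

```python
import pandas as pd

# Hypothetical per-country rows with coordinates and a case count
toy = pd.DataFrame({
    "Lat": [33.9, 41.2, -11.2],
    "Long": [67.7, 20.2, 17.9],
    "Confirmed": [36263, 4880, 950],
})

# Scale weights to the 0-1 range relative to the largest count
weights = toy["Confirmed"] / toy["Confirmed"].max()

# Build rows of [lat, lng, weight]
heat_data = [[lat, lon, w] for lat, lon, w in zip(toy["Lat"], toy["Long"], weights)]

# heat_data could then be passed to the plugin, e.g. HeatMap(data=heat_data, radius=15)
print(heat_data)
```

Since counts span several orders of magnitude, a log-scaled weight might give a more readable map than the linear scaling used here.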
In [ ]:
# Importing necessary libraries
import folium
from folium.plugins import MarkerCluster  # Import MarkerCluster
import math

# Create map
m = folium.Map(location=[54, 15], tiles='cartodbpositron', zoom_start=2)

# Add points using MarkerCluster
mc = MarkerCluster()
for idx, row in full_latest_grouped.iterrows():
    if not math.isnan(row['Long']) and not math.isnan(row['Lat']):
        mc.add_child(folium.Marker([row['Lat'], row['Long']]))  # Use folium.Marker

m.add_child(mc)

# Display the map
m
In [ ]:
# Create map with overall cases registered
m = folium.Map(location=[54,15], zoom_start=2)
HeatMap(data=details[['Lat', 'Long']], radius=15).add_to(m)

# Show the map
m
In [ ]:
# Create map
m = folium.Map(location=[54, 15], tiles='cartodbpositron', zoom_start=2)

# Add points using MarkerCluster
mc = MarkerCluster()
for idx, row in details.iterrows():
    if not math.isnan(row['Long']) and not math.isnan(row['Lat']):
        mc.add_child(folium.Marker([row['Lat'], row['Long']]))

m.add_child(mc)

# Display the map
m
In [ ]:
details.to_csv('covid_19_clean_complete_with_ratios.csv', index=False)
In [ ]:
details.head(10)
Out[ ]:
Province/State Country/Region Lat Long Date Confirmed Deaths Recovered Active WHO Region Year Month Active Ratio Fatality Rate Fatality_Ratio total_cases
0 Afghanistan 33.93911 67.709953 2020-01-22 0 0 0 0 Eastern Mediterranean 2020 1 NaN NaN NaN 0
1 Albania 41.15330 20.168300 2020-01-22 0 0 0 0 Europe 2020 1 NaN NaN NaN 0
2 Algeria 28.03390 1.659600 2020-01-22 0 0 0 0 Africa 2020 1 NaN NaN NaN 0
3 Andorra 42.50630 1.521800 2020-01-22 0 0 0 0 Europe 2020 1 NaN NaN NaN 0
4 Angola -11.20270 17.873900 2020-01-22 0 0 0 0 Africa 2020 1 NaN NaN NaN 0
5 Antigua and Barbuda 17.06080 -61.796400 2020-01-22 0 0 0 0 Americas 2020 1 NaN NaN NaN 0
6 Argentina -38.41610 -63.616700 2020-01-22 0 0 0 0 Americas 2020 1 NaN NaN NaN 0
7 Armenia 40.06910 45.038200 2020-01-22 0 0 0 0 Europe 2020 1 NaN NaN NaN 0
8 Australian Capital Territory Australia -35.47350 149.012400 2020-01-22 0 0 0 0 Western Pacific 2020 1 NaN NaN NaN 0
9 New South Wales Australia -33.86880 151.209300 2020-01-22 0 0 0 0 Western Pacific 2020 1 NaN NaN NaN 0

Spread of Cases (as requested by the instructor)

In [7]:
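details_grouped = details.groupby(["Date", "Country/Region", "Lat", "Long"]).sum(numeric_only=True).reset_index()
# Convert to a DatetimeIndex so each entry is a Timestamp (the animation below calls .date() on it)
dates = sorted(pd.to_datetime(details_grouped["Date"].unique()))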

# Remove duplicate countries to get one lat-long per country
country_coords = {}

# This picks the first occurrence of each country with its Lat and Long
for _, row in details.drop_duplicates(subset=['Country/Region']).iterrows():
    country_coords[row['Country/Region']] = (row['Lat'], row['Long'])
In [11]:
# Ensure Date is datetime
details['Date'] = pd.to_datetime(details['Date'])

# Filter rows with >0 cases
infected_details = details[details['Confirmed'] > 0]

# Find first infection date per country
first_case = infected_details.groupby('Country/Region')['Date'].min()
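The filter-then-`min` pattern in this cell can be illustrated on a tiny example. The `toy` frame below is made up. Each country's first-infection date is the earliest date on which its cumulative count is above zero.

```python
import pandas as pd

# Hypothetical cumulative counts for two countries over three days
toy = pd.DataFrame({
    "Country/Region": ["X", "X", "X", "Y", "Y", "Y"],
    "Date": pd.to_datetime(["2020-01-22", "2020-01-23", "2020-01-24"] * 2),
    "Confirmed": [0, 1, 5, 0, 0, 2],
})

# Keep only rows with at least one case, then take the earliest date per country
first_seen = toy[toy["Confirmed"] > 0].groupby("Country/Region")["Date"].min()
print(first_seen)
```

A country that never reports a case drops out of the result entirely, so later lookups should only iterate over the index of this series.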
In [12]:
edges = []

countries = list(first_case.index)

for target in countries:
    target_date = first_case[target]
    target_lat, target_lon = country_coords[target]

    # Get possible sources: countries that had cases before target
    possible_sources = [src for src in countries if first_case[src] < target_date]

    if not possible_sources:
        continue  # skip first infected country

    # Pick nearest source country
    min_dist = float('inf')
    source_country = None

    for src in possible_sources:
        src_lat, src_lon = country_coords[src]
        dist = ((src_lat - target_lat)**2 + (src_lon - target_lon)**2)**0.5  # simple Euclidean

        if dist < min_dist:
            min_dist = dist
            source_country = src

    # Save the edge: (src_lat, src_lon, tgt_lat, tgt_lon, tgt_date)
    if source_country:
        slat, slon = country_coords[source_country]
        tlat, tlon = target_lat, target_lon
        spread_date = target_date

        edges.append((slat, slon, tlat, tlon, spread_date))
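The nearest-source search uses a straight-line distance in degrees, which is simple but distorts distances at high latitudes and across the 180° meridian. A great-circle (haversine) distance is one possible replacement. The helper below is a sketch, and its name `haversine_km` and the Earth radius of 6371 km are assumptions rather than part of the notebook:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    # Clamp to 1.0 to guard against rounding slightly above the asin domain
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(a)))

# Two points on the equator on either side of the 180° meridian:
# 358 degrees apart by the straight-line measure, but only 2 degrees apart on the globe
near_dateline = haversine_km(0.0, 179.0, 0.0, -179.0)
print(round(near_dateline, 1))
```

Swapping this in for the `dist` computation would change which source country is picked only where the two measures disagree on the ordering.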
In [13]:
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
from matplotlib.animation import FuncAnimation

# Inputs computed in the cells above:
# dates          - sorted list of all dates in the dataset
# country_coords - {country: (lat, lon)}
# edges          - [(source_lat, source_lon, target_lat, target_lon, spread_date), ...]

fig, ax = plt.subplots(figsize=(12, 6))
m = Basemap(projection='mill', llcrnrlat=-60, urcrnrlat=85,
            llcrnrlon=-180, urcrnrlon=180, ax=ax)

def update(frame):
    ax.clear()
    m.drawcoastlines()
    m.drawcountries()

    current_date = dates[frame]
    ax.set_title(f'COVID-19 Spread on {current_date.date()}')

    # Draw countries as dots
    for country, (lat, lon) in country_coords.items():
        x, y = m(lon, lat)
        ax.plot(x, y, 'bo', markersize=5)

    # Draw directional arrows from source to target
    for (slat, slon, tlat, tlon, spread_date) in edges:
        if spread_date <= current_date:
            x_start, y_start = m(slon, slat)
            x_end, y_end = m(tlon, tlat)

            ax.annotate(
                '',
                xy=(x_end, y_end), xycoords='data',
                xytext=(x_start, y_start), textcoords='data',
                arrowprops=dict(
                    arrowstyle="->",
                    color='red',
                    lw=2,
                    alpha=0.7
                )
            )


ani = FuncAnimation(fig, update, frames=len(dates), interval=500)
ani.save('covid_spread_lines.mp4', writer='ffmpeg')

Conclusion

  • In this project, we analyzed the impact of COVID-19 by aggregating and visualizing key metrics such as confirmed cases, deaths, recoveries, and active cases. By computing global fatality rates and deaths per 100 confirmed cases, we gained insights into the severity of the pandemic over time. The use of data visualization techniques helped highlight trends and patterns in the outbreak. This study reinforces the importance of real-time data monitoring and analysis in managing public health crises. Future work can include predictive modeling to forecast case trends and evaluate the effectiveness of mitigation measures.